BlaPred: Predicting and classifying β-lactamase using a 3-tier prediction system via Chou's general PseAAC

J Theor Biol. 2018 Nov 14:457:29-36. doi: 10.1016/j.jtbi.2018.08.030. Epub 2018 Aug 20.

Abstract

Antibiotics of the β-lactam class account for nearly half of global antibiotic use. The β-lactamase enzyme is a major element of the bacterial arsenal for escaping the lethal effect of β-lactam antibiotics. Different variants of β-lactamases have evolved to counter the different types of β-lactam antibiotics. Extensive research has been done to isolate and characterize different variants of β-lactamases. Unfortunately, identification and classification of the β-lactamase enzyme are based purely on experiments, which are both time- and resource-consuming. Thus, there is a need for fast and accurate computational methods to identify and classify new β-lactamase enzymes from the avalanche of sequence data generated in the post-genomic era. Based on these considerations, we have developed a support vector machine-based three-tier prediction system, BlaPred, to predict and classify (as per the Ambler classification) β-lactamases solely from their protein sequences. The input features used were amino acid composition and the classic and amphiphilic pseudo amino acid compositions. The results show that the classic pseudo amino acid composition-based models performed better than the other models. Following a leave-one-out cross-validation procedure, the accuracy in discriminating β-lactamases from non-β-lactamases was 93.57% (tier-I); the accuracies for predicting class A, B, C and D β-lactamases were 93.27%, 95.52%, 96.86% and 97.31%, respectively (tier-II); and the accuracies for predicting subclasses B1, B2 and B3 were 84.78%, 95.65% and 89.13%, respectively (tier-III). The comparative results on an independent dataset suggest that our method works efficiently to distinguish β-lactamases from non-β-lactamases, with an overall accuracy of 93.09%, and is further able to classify β-lactamase sequences into their respective Ambler classes and subclasses with accuracy higher than 92% and 87%, respectively. Comparative performance of BlaPred on an independent benchmark dataset also shows a significant improvement over other existing methods. Finally, BlaPred is available as a webserver, as well as standalone software, which can be accessed at http://proteininformatics.org/mkumar/blapred.
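
As an illustration of the three-tier flow described in the abstract, the following is a minimal sketch and not the BlaPred implementation. It computes the plain amino acid composition (the simplest of the three feature types named above) and routes a sequence through three stages. The function names `amino_acid_composition` and `three_tier_predict`, the result keys, and the idea of passing each tier as a single callable are all assumptions made for illustration; the abstract does not specify how the trained support vector machine models are organised within each tier.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Union

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def three_tier_predict(
    sequence: str,
    tier1: Callable[[List[float]], bool],  # hypothetical: True if beta-lactamase
    tier2: Callable[[List[float]], str],   # hypothetical: returns "A", "B", "C" or "D"
    tier3: Callable[[List[float]], str],   # hypothetical: returns "B1", "B2" or "B3"
) -> Dict[str, Union[bool, Optional[str]]]:
    """Route one sequence through the three tiers (illustrative only)."""
    features = amino_acid_composition(sequence)
    result: Dict[str, Union[bool, Optional[str]]] = {
        "is_beta_lactamase": tier1(features),
        "ambler_class": None,
        "subclass": None,
    }
    if not result["is_beta_lactamase"]:
        return result  # tier-I negative: later tiers are skipped
    result["ambler_class"] = tier2(features)
    if result["ambler_class"] == "B":
        result["subclass"] = tier3(features)  # only class B has subclasses
    return result
```

In practice each callable would wrap a trained model; stub functions that return fixed labels are enough to exercise the routing logic.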

Keywords: Antibiotic resistance; Leave-one-out cross-validation; Pseudo amino acid composition; Support vector machine; β-lactamase.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins* / classification
  • Bacterial Proteins* / genetics
  • Databases, Protein*
  • Sequence Analysis, Protein*
  • Support Vector Machine*
  • beta-Lactam Resistance*
  • beta-Lactamases* / classification
  • beta-Lactamases* / genetics

Substances

  • Bacterial Proteins
  • beta-Lactamases